DETAILED ACTION
Claim Rejections - 35 USC § 112



1. Claim 40 recites the limitation "the conventional browser". There is insufficient 
antecedent basis for this limitation in the claim. 

2. Claims 45 and 48 recite the limitation "the relevant Application Service Provider". 
There is insufficient antecedent basis for this limitation in the claim. 



Claim Rejections - 35 USC § 103 
3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148
USPQ 459 (1966), that are applied for establishing a background for determining
obviousness under 35 U.S.C. 103(a) are summarized as follows: (See MPEP Ch.
2141)

a. Determining the scope and contents of the prior art; 

b. Ascertaining the differences between the prior art and the claims in issue; 

c. Resolving the level of ordinary skill in the pertinent art; and 

d. Evaluating evidence of secondary considerations for indicating 
obviousness or nonobviousness. 

4. Claims 28-32, 34-35, 39, 46-47, and 53 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Suryanarayana, Lalitha, USPGPUB 20030112791 (hereinafter
Lalitha) in view of Kredo et al., US 6816578 B1 (hereinafter Kredo).

Re claims 28-29, 34, and 53, "A system for allowing multi-modal access of 
content over a global data communications network using a mobile station (MS) with a 
user agent, a proxy server, and a telephony platform, wherein:"

Lalitha teaches a wireless environment comprising a mobile station as well as a
Wireless Application Protocol (WAP) proxy and voice-xml server (Lalitha claim 23).

"said mobile station is a dual mode station supporting concurrent voice and data 
sessions" 

Lalitha teaches a multi-modal interface process where the user is accessing the 
Web site via a multi-modal wireless device. In this instance, multi-modal refers to the 
user agent supporting voice as well as data simultaneously for input and output on a 
user interface (Lalitha [0069]). 

"said proxy server comprises an enhanced functionality for supporting voice 
browsing;" 

Lalitha teaches the WAP proxy invoking a voice browsing web service (Lalitha 
[0085] & fig. 9) and a user agent invoking a wireless telephony application interface to 
allow devices to communicate (Lalitha [0083]). Lalitha also teaches the proxy user 
agent that automatically initiates a call to the voice xml web services. Lalitha teaches a 
rule based language where a user can express his or her preference in a rule set that is 
used by a software agent to make automated or semi-automated decisions regarding 
the acceptability of machine-readable privacy policies from P3P enabled Web sites
(Lalitha [0006]). 

"said telephony platform comprises an Automatic Speech Recognizer (ASR) and 
is operative to convert text messages to speech;" 

Lalitha teaches the Web service supports functions such as the ability to perform
text-to-speech conversion and/or speech recognition, generate VXML compatible Web 
pages, and/or traverse them (Lalitha [0089]). 

"key elements are predefined and indicated in the original web content;" 

"when the proxy server recognizes/extracts said key elements, using predefined 
rules, it triggers voice browsing, such that arbitrary web content can be accessed by 
voice commands without requiring conversion of the web content." 

Lalitha also teaches the user agent process to retrieve natural language based 
on the user action such as key depression or voice command (Lalitha [0047]). However 
Lalitha fails to teach the recognition of key elements relevant to the proxy server. Kredo 
teaches that speech recognition technology is effective and reliable in recognizing pre- 
defined words and phrases permitting the formation of a limited vocabulary or language 
(Kredo col 5 line 26-40). Recognized words or phrases are construed to be key 
elements within web content. Therefore, the combined teaching of Lalitha and Kredo as 
a whole would have rendered obvious multi-modal access of content using a mobile 
station, user agent, proxy server, and telephony platform implementing speech
recognition, rules, text to speech conversion and voice browsing. 

Within claim 53, the combined teaching of Lalitha and Kredo fails to disclose a
hyperlink associated with web content. However, the examiner takes official notice that it is
well known to have hyperlinks within web content as part of html. The combined
teaching discloses web servers providing web content such as html (Lalitha [0038]). 

Re claim 30, "wherein the proxy server parses an accessed web content with 
regard to said key elements", the combined teaching of Lalitha and Kredo discloses a
web site server able to parse the user preference (Lalitha [0072]). The combined
teaching of Lalitha and Kredo discloses a proxy server parsing a query having each
possible acceptable answer delineated (Kredo col 1 line 49-63). Therefore, the 
combined teaching of Lalitha and Kredo as a whole would have rendered obvious the 
proxy server parsing web content with regard to key elements. 

Re claim 31, "accessed web content is browsed by means of key strokes or
mouse clicks", the combined teaching of Lalitha and Kredo discloses the device user
inputting information and operating the device by the keypad (Lalitha [0023] & fig. 1). 

Re claim 32, "allows for voice-based access of any tag based content", the 
combined teaching of Lalitha and Kredo discloses web servers that communicate using
HTTP in order to render content that is marked up using XHTML (Lalitha [0036]). A tag 
is construed as a type of markup. 

Re claim 35, "the proxy server interfaces with the Automatic Speech Recognizer 
which comprises a medium size vocabulary speech recognizer", the combined teaching 
of Lalitha and Kredo discloses the WAP proxy invoking a voice browsing web service
(Lalitha [0085] & fig. 9) and a user agent invoking a wireless telephony application
interface to allow devices to communicate (Lalitha [0083]). The combined teaching also
discloses that speech recognition technology is effective and reliable in recognizing pre-
defined words and phrases permitting the formation of a limited vocabulary or language
(Kredo col 5 line 26-40). A medium size vocabulary is construed to be a limited 
vocabulary if the vocabulary is not recited to be full. Therefore, the combined teaching 
of Lalitha and Kredo as a whole would have rendered obvious a proxy server interfacing 
with an automatic speech recognizer having a medium size vocabulary. 

Re claim 39, "the proxy server forwards text prompts to a text-to-speech function 
in the telephony Platform, wherein the text messages are converted to speech and 
forwarded to the user over the voice channel set up by the proxy server", the combined
teaching of Lalitha and Kredo discloses the delivery of text queries and translation of
text or VXML to deliverable audio that a user receives (Kredo fig. 2B). The combined 
teaching also discloses a general network linking servers, an audio browser, and mobile 
stations (Kredo fig. 1). 

Re claim 46, "a request for voice browsing includes at least a voice browsing 
session ID and MSISDN of the user station", the combined teaching of Lalitha and
Kredo discloses a WAP proxy/VXML gateway that transforms the natural language
policy to VXML and generates a user policy identification number. The user policy ID is
transmitted back to the user agent in the wireless device (806). The policy ID
associates a particular natural language policy with a certain user since there may be
multiple users simultaneously requiring transformed natural language policies (Lalitha
[0081]). The combined teaching discloses a WAP proxy invoking a voice browsing web
service (Lalitha [0085] & fig. 9) and a user agent invoking a wireless telephony
application interface to allow devices to communicate (Lalitha [0083]) as well as an 
audio browser connected with a proxy server within a telephone network (Kredo fig. 1). 
The combined teaching also discloses a network provider providing storage of 
telephone numbers and addresses for the telephone user that the user can access 
through a WAP server on a wireless user agent (Lalitha [0087]). An MSISDN is construed
as the telephone number of a mobile device. Therefore, the combined teaching of
Lalitha and Kredo as a whole would have rendered obvious voice browsing having an 
ID number and MSISDN (telephone number). 

Re claim 47, "a user authenticated by the proxy server, a voice channel is 
established, concurrent with a data session channel, between the ASR and the mobile 
station", the combined teaching of Lalitha and Kredo discloses a wireless environment
comprising a mobile station as well as a Wireless Application Protocol (WAP) proxy and voice-
xml server (Lalitha claim 23). The combined teaching also discloses the proxy user
agent that automatically initiates a call to the voice xml web services. The combined 
teaching teaches the WAP proxy invoking a voice browsing web service (Lalitha [0085] 
& fig. 9) and a user agent invoking a wireless telephony application interface to allow 
devices to communicate (Lalitha [0083]). Additionally, the combined teaching discloses 
a proxy server that identifies a caller and accesses the user's profile that includes
passwords, logins, and preferences for the service (Kredo col 5 line 41-54) and the 
proxy server also identifies a user by processing identification information (Kredo col 5 
line 55-63). Authentication is construed as the confirming of the identity of a user.
Therefore, the combined teaching of Lalitha and Kredo as a whole would have rendered 
obvious authentication by a proxy server and a voice channel established between an 
ASR and a mobile station. 

5. Claims 33, 37-38, and 43 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Suryanarayana, Lalitha, USPGPUB 20030112791 (hereinafter
Lalitha) in view of Kredo et al., US 6816578 B1 (hereinafter Kredo) and further in
view of Rhie et al., US 5953392 A (hereinafter Rhie).

Re claim 33, "user of the mobile station uses a key element indicated in the web 
content to select a specific hyperlink", the combined teaching of Lalitha and Kredo
discloses a user agent process to retrieve natural language based on the user action
such as key depression or voice command (Lalitha [0047]). However, the combined
teaching fails to disclose using content to select a hyperlink. Rhie teaches a system
that converts the information content of a web page from text to speech (voice signals),
signals the hyperlink selections of a web page in an audio manner, and allows selection 
of the hyperlinks (Rhie col 2 line 12-24). Therefore, the combined teaching of Lalitha, 
Kredo, and Rhie as a whole would have rendered obvious selecting a specific hyperlink 
indicated by key elements within web content. 

Claim 37 has been analyzed and rejected with respect to claim 33. Claim 33
teaches the limitation of claim 37. A unique keyword is broad and is construed as a
specific keyword. A simple rule is broad and is construed as a preference rule for a
language.

Application/Control Number: 10/519,640 Page 9

Art Unit: 2626

Re claim 38, "predefined rules for voice key element extraction are numeric rules 
numbering hyperlinks in said web content", the combined teaching of Lalitha, Kredo, 
and Rhie discloses a rule based language where a user can express his or her
preference in a rule set that is used by a software agent to make automated or semi- 
automated decisions regarding the acceptability of machine-readable privacy policies 
from P3P enabled Web sites (Lalitha [0006]). The combined teaching also discloses 
that in order for the user to access a hyperlink on the web page, the first web page 
needs to be faxed back to the user with the hyperlinks numerically annotated for 
reference (Rhie col 1 line 46-60). Therefore, the combined teaching of Lalitha, Kredo, 
and Rhie as a whole would have rendered obvious numeric rules numbering hyperlinks 
in web content. 

Re claim 43, "a connection is established between the proxy server and the 
Automatic Speech Recognizer of the telephony platform for specifying and identifying a 
called application to be accessed", the combined teaching of Lalitha and Kredo discloses
a proxy server (Kredo fig. 1) connected with an audio browser composed of a speech 
synthesizer and speech recognition software used to generate instructions for a 
telephony user (Kredo col 9 line 9-29). The combined teaching also discloses a WAP
proxy/VXML gateway that transforms the natural language policy to VXML and
generates a user policy identification number. The user policy ID is transmitted back to 
the user agent in the wireless device (806). The policy ID associates a particular 
natural language policy with a certain user since there may be multiple users
simultaneously requiring transformed natural language policies (Lalitha [0081]). The 
combined teaching discloses an application call flow applying the policy identification 
number (Lalitha fig. 7). Therefore, the combined teaching of Lalitha, Kredo, and Rhie as 
a whole would have rendered obvious a proxy server and automatic speech recognizer 
to specify and identify a called application to be accessed. 

6. Claim 36 is rejected under 35 U.S.C. 103(a) as being unpatentable over 
Suryanarayana, Lalitha, USPGPUB 20030112791 (hereinafter Lalitha) in view of
Kredo et al., US 6816578 B1 (hereinafter Kredo) and further in view of Groner, US
6507643 B1. 

Re claim 36, "predefined rules for voice key element extraction are syntactic 
rules", the combined teaching of Lalitha and Kredo discloses a rule based language
where a user can express his or her preference in a rule set that is used by a software 
agent to make automated or semi-automated decisions regarding the acceptability of 
machine-readable privacy policies from P3P enabled Web sites (Lalitha [0006]). 
However, the combined teaching fails to disclose the rules to be syntactic. Groner
teaches a syntax-by-rule speech recognition procedure 144 to recognize predefined
known categories of speech (Groner col 6 line 45-51). Therefore, the combined 
teaching of Lalitha, Kredo, and Groner as a whole would have rendered obvious 
elements extracted using predefined syntactic rules. 

7. Claims 40-42, 44-45, 48-52, and 54 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Suryanarayana, Lalitha, USPGPUB 20030112791 (hereinafter
Lalitha) in view of Kredo et al., US 6816578 B1 (hereinafter Kredo) and further in
view of Gong et al., US 7177814 B2 (hereinafter Gong).

Re claim 40, "between the conventional browser in the user agent and the speech 
browser in the proxy server a synchronization engine is provided", the combined
teaching of Lalitha and Kredo teaches a WAP proxy invoking a voice browsing web
service (Lalitha [0085] & fig. 9) and a user agent invoking a wireless telephony 
application interface to allow devices to communicate (Lalitha [0083]) as well as an 
audio browser connected with a proxy server within a telephone network (Kredo fig. 1). 
The combined teaching also discloses a speech browser between two different user 
networks (Kredo fig. 1). However the combined teaching fails to disclose 
synchronization between components within a network. Gong teaches a system for 
synchronizing multiple modes (Gong col 9 line 33-39 & fig. 1). Therefore, the combined 
teaching of Lalitha, Kredo, and Gong as a whole would have rendered obvious 
synchronization between a browser in the user agent and a browser in the proxy server. 

Re claim 41, "the proxy server comprises a pushing mechanism for making the
MS user agent refresh indicated, fetched content", the combined teaching of Lalitha and
Kredo teaches a WAP proxy invoking a voice browsing web service (Lalitha [0085] &
fig. 9) and a user agent invoking a wireless telephony application interface to allow 
devices to communicate (Lalitha [0083]). However the combined teaching fails to 
disclose a pushing mechanism to have the user agent refresh content. Gong teaches a
server-push process for synchronizing a browser after a voice gateway requests a
VXML page (Gong col 4 line 12-14) and sends a message indicating a corresponding 
HTML page and updating an HTML page (Gong fig. 3). Therefore, the combined 
teaching of Lalitha, Kredo, and Gong as a whole would have rendered obvious a proxy 
server with a push mechanism to refresh content. 

Re claim 42, "a semaphore object is introduced into the content returned to the 
proxy server for indicating activation or not of content refresh", the combined teaching of
Lalitha and Kredo teaches a WAP proxy invoking a voice browsing web service
(Lalitha [0085] & fig. 9) and a user agent invoking a wireless telephony application 
interface to allow devices to communicate (Lalitha [0083]). However the combined 
teaching fails to disclose a semaphore object to indicate if content was refreshed. Gong 
teaches a server-push process for synchronizing a browser after a voice gateway 
requests a VXML page (Gong col 4 line 12-14). Gong also teaches a voice gateway 
that requests a VXML, a server that sends a message indicating a corresponding HTML 
page to a browser, and a browser updating an HTML page (Gong fig. 3). A semaphore 
is construed as an object used for the allowance of synchronization and communication. 
Therefore, the combined teaching of Lalitha, Kredo, and Gong as a whole would have 
rendered obvious a semaphore for indicating if content was refreshed. 

Claims 44, 45, 48, and 49:

The combined teaching of Lalitha and Kredo discloses a wireless environment
comprising a mobile station as well as a Wireless Application Protocol (WAP) proxy and voice-
xml server (Lalitha claim 23). Lalitha also teaches the proxy user agent that
automatically initiates a call to the voice xml web services. Lalitha teaches the WAP
proxy invoking a voice browsing web service (Lalitha [0085] & fig. 9) and a user agent
invoking a wireless telephony application interface to allow devices to communicate 
(Lalitha [0083]). The combined teaching discloses web servers that communicate using 
HTTP in order to render content that is marked up using XHTML (Lalitha [0036]) and the 
retrieval and translation of web content (Lalitha [0038]). 

Re claim 44, "the proxy server comprises a number of subscriber records, and in 
that for each subscriber for which voice browsing should be supported, means for 
indication of voice browsing activation, optional key element for triggering voice 
browsing or optional hyperlink name, for insertion in accessed web content, and which, 
when selected, provides for establishment of a voice channel between the ASR and the
mobile station", the combined teaching fails to disclose subscriber records and hyperlink 
names. Gong teaches operations initiated by a user providing a voice command to
the voice gateway 285 telling the voice gateway 285 to navigate to a new web page
(Gong col 11 line 48-60). Gong teaches a subscribe system having separate devices,
each including one gateway, can be synchronized by keeping track of the IP addresses 
and port numbers of the separate devices, or by having the devices subscribe to the 
same topic at a publish/subscribe system (Gong col 19 line 43-55). Gong teaches a 
web server determining a hypertext markup language HTML (Gong col 5 line 52-63). 
Therefore, the combined teaching of Lalitha, Kredo, and Gong as a whole would have 
rendered obvious a proxy server with subscriber records capable of using commands or 
key elements to trigger voice browsing, where a connection between an ASR and a
mobile station is established.

Re claim 45, "if voice browsing is activated, the access request is forwarded from
the proxy server to the relevant Application Service Provider, which returns the 
requested content to the proxy server, and in that said proxy server comprises parsing 
and analyzing means for finding and indicating key elements, before forwarding the 
content as modified to the mobile station", the combined teaching of Lalitha, Kredo, and
Gong discloses a proxy communicating with the Web service provider to provide
necessary function for the user (Lalitha [0088]). The combined teaching also discloses 
a parse process having a voice recognition phase to recognize a string or strings (Gong 
fig. 15). The combined teaching also discloses messages forwarded to an instant
message proxy server and messages sent to an audio browser then sent to a mobile
terminal (Kredo col 2 line 11-24). Therefore, the combined teaching of Lalitha, Kredo,
and Gong as a whole would have rendered obvious when voice browsing is activated, 
requests are forwarded from a proxy server to a service provider and back to a proxy 
server, where the proxy server parses and analyzes the data and sends it to a mobile 
station. 

Re claim 48, "keywords as recognized in voice commands from the end user are 
provided to the proxy server, and in that the proxy server comprises matching means for 
matching recognized voice commands with stored key elements, for finding the relevant 
link on which to send a request to the Application Service Provider, and in that the
requested content, upon reception in the proxy server, is parsed, analyzed and pushed 
to the user agent", the combined teaching of Lalitha, Kredo, and Gong discloses the user
agent process to retrieve natural language based on the user action such as key
depression or voice command (Lalitha [0047]). The combined teaching of Lalitha and
Kredo discloses a proxy server parsing a query having each possible acceptable answer
delineated (Kredo col 1 line 49-63). The combined teaching discloses a proxy
communicating with the Web service provider to provide necessary function for the user
(Lalitha [0088]). A web service provider is construed as an application service provider, where data
from the provider allows for parsing. The combined teaching also discloses a parse
process having a voice recognition phase to recognize a string or strings (Gong fig. 15). 
The combined teaching also discloses spoken data related to input matched to stored 
data within a grammar (Gong col 2 line 1-5). The combined teaching also discloses a 
user requesting a new html page by clicking on a link with a browser and the browser 
sending the request to a synchronization controller (Gong col 16 line 11-26). Therefore,
the combined teaching of Lalitha, Kredo, and Gong as a whole would have rendered
obvious a proxy server matching voice commands with stored data to find a relevant link 
to send a request to the service provider where parsing and pushing take place prior to 
being sent to a user agent. 

Claim 49 has been analyzed and rejected with respect to claims 41, 42, and 48.
The combined teaching discloses an established connection between a proxy server, 
audio browser, and mobile station. A semaphore is construed as an object used for the 
allowance of synchronization and communication. 

Claim 50 has been analyzed and rejected with respect to claim 42. The
combined teaching discloses a browser sending a request to the web server for any 
updates, where the requests are refresh requests or requests for updates and the 
browser sends the requests on a recurring basis from a send frame (Gong col 13 line 
22-26). 

Claim 51 has been analyzed and rejected with respect to claim 50. A script is 
construed as merely a set of instructions or commands. The combined teaching 
discloses an embedded JavaScript command in the refresh reply to the browser, where 
the JavaScript command instructs the browser to load a new html page (Gong col 13 
line 27-37). 

Re claim 52, "the client semaphore object is created using a WML script variable, 
fetched from the proxy server, and, in the proxy server, a first and a second version of
said script is stored, the first version comprising a script for semaphore activation, the 
second version comprising a script indicating semaphore inactive", the combined 
teaching discloses a browser sending a request to the web server for any updates, 
where the requests are refresh requests or requests for updates and the browser sends 
the requests on a recurring basis from a send frame (Gong col 13 line 22-26). The 
combined teaching discloses an embedded JavaScript command in the refresh reply to 
the browser, where the JavaScript command instructs the browser to load a new html 
page (Gong col 13 line 27-37). A WML or wireless markup language script is construed
to function as a JavaScript used with the Wireless Application Protocol. The combined teaching
discloses a wireless environment comprising a mobile station as well as a Wireless Application
Protocol (WAP) proxy and voice-xml server (Lalitha claim 23). Additionally, the
combined teaching discloses a WAP proxy server that translates HTML into WML (Lalitha
[0038]). The combined teaching also discloses spoken data related to input matched to
stored data within a grammar (Gong col 2 line 1-5). The combined teaching also 
discloses a server-push process for synchronizing a browser after a voice gateway 
requests a VXML page (Gong col 4 line 12-14) and sends a message indicating a 
corresponding HTML page and updating an HTML page (Gong fig. 3). A semaphore is 
construed as an object used for the allowance of synchronization and communication. 
Therefore, the combined teaching of Lalitha, Kredo, and Gong as a whole would have 
rendered obvious a semaphore object created using a WML script variable from a proxy 
server storing semaphore activation version and another semaphore inactive indication 
version. 

Claim 54 has been analyzed and rejected with respect to claim 28. Claim 54 
teaches the system of the method of claim 28. Additionally, "in the enhanced proxy, the 
content by changing tag attributes to make key elements identifiable to the user", the
combined teaching of Lalitha and Kredo discloses web servers that communicate using
HTTP in order to render content that is marked up using XHTML (Lalitha [0036]). A tag 
is construed as a type of markup. A speech recognizer is construed to be the same as 
a speech register. "Parsing content and analyzing paragraphs in the content to find key 
elements", the combined teaching discloses the user agent process to retrieve natural 
language based on the user action such as key depression or voice command (Lalitha 
[0047]). The combined teaching discloses a proxy server parsing a query having each 
possible acceptable answer delineated (Kredo col 1 line 49-63). The combined
teaching discloses that speech recognition technology is effective and reliable in 
recognizing pre-defined words and phrases permitting the formation of a limited 
vocabulary or language (Kredo col 5 line 26-40). Recognizing words and phrases is 
construed to imply analyzing a paragraph. "Opening a voice browsing session",
"opening a voice channel concurrent with a data session channel", the combined
teaching discloses a wireless environment comprising a mobile station as well as a 
wireless area protocol (WAP) proxy and voice-xml server (Lalitha claim 23). The 
combined teaching discloses a proxy server (Kredo fig. 1) connected with an audio 
browser composed of a speech synthesizer and speech recognition software used to 
generate instructions for a telephony user (Kredo col 9 line 9-29). "In the enhanced
proxy server, keywords recognized in a user voice command with predefined and
selected keywords to establish which link to use for sending a get request to the 
relevant application service provider; and processing and pushing the content received 
from the application service provider to the user agent". However, the combined
teaching fails to disclose pushing content and matching recognized key words. Gong
discloses a parse process having a voice recognition phase to recognize a string or 
strings (Gong fig. 15). Gong also discloses spoken data related to input matched to 
stored data within a grammar (Gong col 2 line 1-5). Gong teaches a server-push 
process for synchronizing a browser after a voice gateway requests a VXML page 
(Gong col 4 line 12-14) and sends a message indicating a corresponding HTML page 
and updating an HTML page (Gong fig. 3). Therefore, the combined teaching of Lalitha, 
Kredo, and Gong as a whole would have rendered obvious multimodal access of
Internet content from a mobile station consisting of the steps illustrated in the remainder 
of claim 54. 